Novel estrogen receptor



This invention relates to the field of receptors
belonging to the superfamily of nuclear hormone
receptors, in particular to steroid receptors. The
invention relates to DNA encoding a novel steroid
receptor, the preparation of said receptor, the receptor
protein, and the uses thereof.

Steroid receptors belong to a superfamily of nuclear
hormone receptors involved in ligand-dependent
transcriptional control of gene expression. In addition,
this superfamily includes receptors for non-steroid
hormones such as vitamin D, thyroid hormones and
retinoids (Giguère et al, Nature 330, 624-629, 1987;
Evans, R.M., Science 240, 889-895, 1988). Moreover, a
range of nuclear receptor-like sequences have been
identified which encode so-called 'orphan' receptors:
these receptors are structurally related to and therefore
classified as nuclear receptors, although no putative
ligands have been identified yet (B.W. O'Malley,
Endocrinology 125, 1119-1170, 1989; D.J. Mangelsdorf and
R.M. Evans, Cell, 83, 841-850, 1995).

The members of the superfamily of nuclear hormone
receptors share a modular structure in which six distinct
structural and functional domains, A to F, are displayed
(Evans, Science 240, 889-895, 1988). A nuclear hormone
receptor is characterized by a variable N-terminal region
(domain A/B), followed by a centrally located, highly
conserved DNA-binding domain (hereinafter referred to as
DBD; domain C), a variable hinge region (domain D), a
conserved ligand-binding domain (hereinafter referred to
as LBD; domain E) and a variable C-terminal region
(domain F).

The N-terminal region, which is highly variable in
size and sequence, is poorly conserved among the
different members of the superfamily. This part of the
receptor is involved in the modulation of transcription
activation (Bocquel et al, Nucl. Acid Res., 17, 2581-
2595, 1989; Tora et al, Cell 59, 477-487, 1989).

The DBD consists of approximately 66 to 70 amino acids 
and is responsible for DNA-binding activity: it targets 
the receptor to specific DNA sequences called hormone 
responsive elements (hereinafter referred to as HRE) 
within the transcription control unit of specific target
genes on the chromatin (Martinez and Wahli, In "Nuclear
Hormone Receptors", Acad. Press, 125-153, 1991).

The LBD is located in the C-terminal part of the
receptor and is primarily responsible for ligand-binding
activity. In this way, the LBD is essential for
recognition and binding of the hormone ligand and, in
addition, possesses a transcription activation function,
thereby determining the specificity and selectivity of
the hormone response of the receptor. Although moderately
conserved in structure, the LBD's are known to vary
considerably in homology between the individual members
of the nuclear hormone receptor superfamily (Evans,
Science 240, 889-895, 1988; P.J. Fuller, FASEB J., 5,
3092-3099, 1991; Mangelsdorf et al, Cell, Vol. 83, 835-
839, 1995).

Functions present in the N-terminal region, LBD and 
DBD operate independently from each other and it has been 
shown that these domains can be exchanged between nuclear 
receptors (Green et al, Nature, Vol. 325, 75-78, 1987).
This results in chimeric nuclear receptors, such as 
described for instance in WO-A-8905355.

When a hormone ligand for a nuclear receptor enters
the cell by diffusion and is recognized by the LBD, it
will bind to the specific receptor protein, thereby
initiating an allosteric alteration of the receptor
protein. As a result of this alteration the
ligand/receptor complex switches to a transcriptionally
active state and as such is able to bind through the
presence of the DBD with high affinity to the
corresponding HRE on the chromatin DNA (Martinez and
Wahli, "Nuclear Hormone Receptors", 125-153, Acad. Press,
1991). In this way the ligand/receptor complex modulates
expression of the specific target genes. The diversity
achieved by this family of receptors results from their
ability to respond to different ligands.



The steroid receptors are a distinct class of the
nuclear receptor superfamily, characterized in that the
putative ligands are steroid hormones. The receptors for
glucocorticoids (GR), mineralocorticoids (MR),
progesterone (PR), androgens (AR) and estrogens (ER) are
classical steroid receptors. Furthermore, the steroid
receptors have the unique ability upon activation to bind
to palindromic DNA sequences, the so-called HRE's, as
homodimers. The GR, MR, PR and AR recognize the same DNA
sequence, while the ER recognizes a different DNA
sequence (Beato et al, Cell, Vol. 83, 851-857, 1995).
After binding to DNA, the steroid receptor is thought to
interact with components of the basal transcriptional
machinery and with sequence-specific transcription
factors, thus modulating the expression of specific
target genes.

Several HRE's have been identified, which are
responsive to the hormone/receptor complex. These HRE's
are situated in the transcriptional control units of the
various target genes such as mammalian growth hormone
genes (responsive to G, E, T), mammalian prolactin genes
and progesterone receptor genes (responsive to E), avian
ovalbumin genes (responsive to P), mammalian
metallothionein gene (responsive to G) and mammalian
hepatic α2u-globulin gene (responsive to A, E, T, G).



The steroid receptors have been known to be involved 
in embryonic development, adult homeostasis as well as 
organ physiology. Various diseases and abnormalities have 
been ascribed to a disturbance in the steroid hormone 
pathway. Since the steroid receptors exercise their
influence as hormone-activated transcriptional 
modulators, it can be anticipated that mutations and 
defects in these receptors, as well as overstimulation or 
blocking of these receptors might be the underlying 
reason for the altered pattern. A better knowledge of 
these receptors, their mechanism of action and of the 
ligands which bind to said receptors might help to create
a better insight in the underlying mechanism of the 
hormone pathway, which eventually will lead to better 
treatment of the diseases and abnormalities linked to 
altered hormone/receptor functioning. 

For this reason cDNA's of the steroid and several
other nuclear receptors of several mammals, including
humans, have been isolated and the corresponding amino
acid sequences have been deduced, such as for example the
human steroid receptors PR, ER, GR, MR, and AR, the human
non-steroid receptors for vitamin D, thyroid hormones,
and retinoids such as retinol A and retinoic acid. In
addition, cDNA's of well over 100 mammalian orphan
receptors have been isolated, for which no putative
ligands are known yet (Mangelsdorf et al, Cell, Vol. 83,
835-839, 1995). However, there is still a great need for
the elucidation of other nuclear receptors, in order to
unravel the various roles these receptors play in normal
physiology and pathology.

The present invention provides for such a novel
nuclear receptor. More specifically, the present invention
provides for novel steroid receptors, having estrogen
mediated activity. Said novel steroid receptors are novel
estrogen receptors, which are able to bind and be
activated by, for example, estradiol, estrone and
estriol.

According to the present invention it has been found 
that a novel estrogen receptor is expressed as an 8 kb
transcript in human thymus, spleen, peripheral blood
lymphocytes (PBLs), ovary and testis. Furthermore,
additional transcripts have been identified. In testis,
an additional transcript of 1.3 kb was detected. Another
transcript of approximately 10 kb was identified in
ovary, thymus and spleen. These two transcripts are
probably generated by alternative splicing of the gene
encoding the novel estrogen receptor according to the
invention. 

Cloning of the cDNA's encoding the novel estrogen
receptors according to the invention revealed that
several splicing variants of said receptor can be
distinguished. At the protein level, these variants
differ only at the C-terminal part.



It is true that an estrogen receptor is already known: 
cDNA encoding the classical ER was isolated (Green, et
al, Nature 320, 134-139, 1986; Greene et al, Science 231,
1150-1154, 1986), and its amino acid sequence deduced.
Although both ER's share a great deal of homology, the
amino acid sequences of both receptors vary considerably.
The homology between the classical ER and the novel ER's
according to the invention resides predominantly in the
DBD's and LBD's of said receptors. Thus, the two
receptors are distinct, encoded by different genes,
which belong to the subclass of estrogen receptors.

Furthermore, two orphan receptors, ERRα and ERRβ,
having an estrogen receptor related structure have been
described. Based on the structural relatedness of ERRα
and ERRβ with the classic ER, these orphans are
considered to be members of the estrogen receptor
subclass. These receptors, however, have not been
reported to be able to bind estradiol or any other
hormone that binds to the classic ER, and other ligands
which bind to these receptors have not been found yet
(ref?). The novel estrogen receptor according to the
invention distinguishes itself clearly from these
receptors since it was found to bind estrogens.

The fact that a novel ER according to the invention
has been found is all the more surprising, since any
suggestion towards the existence of additional estrogen
receptors was absent in the scientific literature:
neither the isolation of the classical ER nor the orphan
receptors ERRα and ERRβ suggested or hinted towards the
presence of additional estrogen receptors such as the
receptors according to the invention. The identification
of additional ER's could be a major step forward for the
existing clinical therapies, which are based on the
presence of one ER and as such ascribe all estrogen
mediated abnormalities and/or diseases to this one
receptor. The presence of additional estrogen receptors,
such as the receptors according to the invention, will be
useful in the development of hormone analogs that
selectively activate either the classic ER or the novel
estrogen receptor according to the invention. This should
be considered as one of the major advantages of the
present invention.

Thus, in one aspect, the present invention provides 
for isolated cDNA encoding a novel steroid receptor. In 
particular, the present invention provides for isolated 
cDNA encoding a novel estrogen receptor. 

According to this aspect of the present invention, 
there is provided an isolated DNA encoding a steroid 
receptor protein having an N-terminal domain, a DNA- 
binding domain and a ligand-binding domain, wherein the 
amino acid sequence of said DNA-binding domain of said 
receptor protein exhibits at least 80% homology with the 
amino acid sequence shown in SEQ ID NO: 3, and the amino 
acid sequence of said ligand-binding domain of said 
receptor protein exhibits at least 70% homology with the
amino acid sequence shown in SEQ ID NO: 4. 

In particular, the isolated DNA encodes a steroid 
receptor protein having an N-terminal domain, a DNA-
binding domain and a ligand-binding domain, wherein the
amino acid sequence of said DNA-binding domain of said 
receptor protein exhibits at least 90%, preferably 95%, 
more preferably 98%, most preferably 100% homology with 
the amino acid sequence shown in SEQ ID NO: 3. 

More particularly, the isolated DNA encodes a steroid 
receptor protein having an N-terminal domain, a DNA-
binding domain and a ligand-binding domain, wherein the
amino acid sequence of said ligand-binding domain of said
receptor protein exhibits at least 75%, preferably 80%,
more preferably 90%, most preferably 100% homology with
the amino acid sequence shown in SEQ ID NO: 4.

A preferred isolated DNA according to the invention 
encodes a steroid receptor protein having the amino acid 
sequence shown in SEQ ID NO: 5, SEQ ID NO: 6 or SEQ ID
NO: 21.

A more preferred isolated DNA according to the
invention is an isolated DNA comprising a nucleotide
sequence shown in SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID
NO: 20.

The DNA according to the invention may be obtained 
from cDNA. Alternatively, the coding sequence might be 
genomic DNA, or prepared using DNA synthesis techniques. 

The DNA according to the invention will be very useful 
for in vivo expression of the novel receptor proteins 
according to the invention in sufficient quantities and 
in substantially pure form.

In another aspect of the invention, there is provided
for a steroid receptor comprising the amino acid sequence 
encoded by the above described DNA molecules. 

The steroid receptor according to the invention has an 
N-terminal domain, a DNA-binding domain and a ligand-
binding domain, wherein the amino acid sequence of said
DNA-binding domain of said receptor exhibits at least 80%
homology with the amino acid sequence shown in SEQ ID
NO: 3, and the amino acid sequence of said ligand-binding
domain of said receptor exhibits at least 70% homology
with the amino acid sequence shown in SEQ ID NO: 4.

In particular, the steroid receptor according to the 
invention has an N-terminal domain, a DNA-binding domain
and a ligand-binding domain, wherein the amino acid
sequence of said DNA-binding domain of said receptor
exhibits at least 90%, preferably 95%, more preferably 
98%, most preferably 100% homology with the amino acid 
sequence shown in SEQ ID NO: 3. 

More particularly, the steroid receptor according to the
invention has an N-terminal domain, a DNA-binding domain
and a ligand-binding domain, wherein the amino acid
sequence of said ligand-binding domain of said receptor
exhibits at least 75%, preferably 80%, more preferably
90%, most preferably 100% homology with the amino acid
sequence shown in SEQ ID NO: 4. 

It will be clear for those skilled in the art that 
also steroid receptor proteins comprising combined DBD 
and LBD preferences and DNA encoding such receptors are 
subject of the invention. 

Preferably, the steroid receptor according to the 
invention comprises an amino acid sequence shown in SEQ 
ID NO: 5, SEQ ID NO: 6 or SEQ ID NO: 21.

Also within the scope of the present invention are 
steroid receptor proteins which comprise variations in
the amino acid sequence of the DBD and LBD without
losing their respective DNA-binding or ligand-binding
activities. The variations that can occur in those amino
acid sequences comprise deletions, substitutions,
insertions, inversions or additions of (an) amino acid(s)
in said sequence, said variations resulting in amino acid
difference(s) in the overall sequence. It is well known
in the art of proteins and peptides that these amino acid
differences lead to amino acid sequences that are
different from, but still homologous with the native
amino acid sequence they have been derived from.

Amino acid substitutions that are expected not to
essentially alter biological and immunological
activities, have been described in for example Dayhoff,
M.O., Atlas of protein sequence and structure, Nat.
Biomed. Res. Found., Washington D.C., 1978, vol. 5,
suppl. 3. Amino acid replacements between related amino
acids or replacements which have occurred frequently in
evolution are, inter alia Ser/Ala, Ser/Gly, Asp/Gly,
Asp/Asn, Ile/Val. Based on this information Lipman and
Pearson developed a method for rapid and sensitive
protein comparison (Science 227, 1435-1441, 1985) and
determining the functional similarity between homologous
polypeptides.

Variations in amino acid sequence of the DBD according 
to the invention resulting in an amino acid sequence that
has at least 80% homology with the sequence of SEQ ID
NO: 3 will lead to receptors still having sufficient DNA
binding activity. Variations in amino acid sequence of
the LBD according to the invention resulting in an amino
acid sequence that has at least 70% homology with the
sequence of SEQ ID NO: 4 will lead to receptors still
having sufficient ligand binding activity. 

Homology as defined herein is expressed in 
percentages, determined via PCGENE. 
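
By way of illustration only: the PCGENE calculation itself is not
reproduced here. The following Python sketch assumes two sequences
that have already been aligned to equal length (gaps written as '-')
and computes a simple percent identity over the non-gap columns, which
may differ from the homology figure PCGENE reports. The function names
and the toy sequences are hypothetical; the 80% and 70% cut-offs are
the DBD and LBD thresholds stated above.

```python
DBD_THRESHOLD = 80.0  # percent, threshold stated for SEQ ID NO: 3
LBD_THRESHOLD = 70.0  # percent, threshold stated for SEQ ID NO: 4


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns that carry identical residues.

    Assumption: both inputs are pre-aligned, so they have equal length
    and use '-' for gaps. This is NOT the PCGENE algorithm.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Ignore every column in which either sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity reaches the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-residue toy fragments differing at one position.
toy_reference = "CEGCKAFFKR"
toy_variant = "CEGCKGFFKR"
print(percent_identity(toy_reference, toy_variant))                # 90.0
print(meets_threshold(toy_reference, toy_variant, DBD_THRESHOLD))  # True
```

A real comparison would first need an alignment step and a
substitution matrix; both are left out of this sketch.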



Comparing the amino acid sequence of the classic ER 
and the ER's according to the invention revealed a high
degree of similarity within their respective DBD's. The
conservation of the P-box (amino acids E-G-X-X-A) which
is responsible for the actual interactions of ERα with
the target DNA element (Zilliacus et al., Mol. Endo. 9,
389, 1995; Glass, End. Rev. 15, 391, 1994), is indicative
for a recognition of estrogen responsive elements (ERE's)
by the ER's according to the invention. Therefore, the
classical ER and novel ER's according to the invention
may have overlapping target gene specificities. This
could indicate that in tissues which co-express both
respective ER's, these receptors compete for ERE's. The
ER's according to the invention may regulate
transcription of target genes differently from classical
ER regulation or could simply block classical ER
functioning by occupying estrogen responsive elements.

Thus, a preferred steroid receptor according to the
invention comprises the amino acid sequence E-G-X-X-A
within the P-box of the DNA binding domain, wherein X
stands for any amino acid. Also within the scope of the
invention is isolated DNA encoding such a receptor. 
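
As a hedged illustration of how the E-G-X-X-A motif could be located
in a one-letter amino acid sequence, the Python sketch below uses a
regular expression in which X is taken to be any of the 20 standard
amino acids. The function name and the toy sequence are assumptions
made for the example and are not taken from the sequence listing.

```python
import re

# E-G-X-X-A with X = any of the 20 standard amino acids (one-letter code).
P_BOX = re.compile(r"EG[ACDEFGHIKLMNPQRSTVWY]{2}A")


def find_p_box(dbd_sequence: str):
    """Return (0-based start, matched motif) for each non-overlapping hit."""
    return [(m.start(), m.group())
            for m in P_BOX.finditer(dbd_sequence.upper())]


# Hypothetical toy DBD fragment used only to exercise the pattern.
toy_dbd = "CAVCNDYASGYHYGVWSCEGCKAFFKRSIQ"
print(find_p_box(toy_dbd))  # [(18, 'EGCKA')]
```

Note that `finditer` reports non-overlapping matches only, which is
sufficient for a five-residue motif expected once per DBD.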

Methods to prepare the receptors according to the 
invention are well known in the art (Sambrook et al.,
Molecular Cloning: a Laboratory Manual, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, latest
edition). The most practical approach is to produce these
receptors by expression of the DNA encoding the desired
protein.

A wide variety of host cell and cloning vehicle 
combinations may be usefully employed in cloning the 
nucleic acid sequence coding for the receptor of the 
invention. For example, useful cloning vehicles may
include chromosomal, non-chromosomal and synthetic DNA
sequences such as various known bacterial plasmids and
wider host range plasmids and vectors derived from
combinations of plasmids and phage or virus DNA. Useful
hosts may include bacterial hosts, yeasts and other
fungi, plant or animal hosts, such as Chinese Hamster
Ovary (CHO) cells or monkey cells and other hosts.

Vehicles for use in expression of the ligand-binding 
domain of the present invention will further comprise 
control sequences operably linked to the nucleic acid
sequence coding for the ligand-binding domain. Such
control sequences generally comprise a promoter sequence 
and sequences which regulate and/or enhance expression 
levels. Furthermore an origin of replication and/or a 
dominant selection marker are often present in such 
vehicles. Of course control and other sequences can vary 
depending on the host cell selected. 

Techniques for transforming or transfecting host cells 
are well known in the art (see, for instance, Maniatis
et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, 1982 and 1989).

Recombinant expression vectors comprising the DNA of 
the invention as well as cells transformed with said DNA 
or said expression vector also form part of the present 
invention. 



In a further aspect of the invention, there is 
provided for a chimeric receptor protein having an N- 
terminal domain, a DNA-binding domain, and a ligand- 
binding domain, characterized in that at least one of the 
domains originates from a receptor protein according to 
the invention, and at least one of the other domains of 
said chimeric protein originates from another receptor 
protein from the nuclear receptor superfamily, provided 
that the DNA-binding domain and the ligand-binding domain 
of said chimeric receptor protein originate from 
different proteins. 
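
The two conditions in the preceding paragraph can be expressed as a
small data-structure check. The Python sketch below is illustrative
only: the class names, the `NOVEL_ER` label and the use of plain
strings to name the source receptor are assumptions introduced for the
example.

```python
from dataclasses import dataclass

NOVEL_ER = "novel ER"  # hypothetical label for a receptor of the invention


@dataclass(frozen=True)
class Domain:
    kind: str    # "N-terminal", "DBD" or "LBD"
    source: str  # receptor the domain originates from, e.g. "PR"


@dataclass(frozen=True)
class ChimericReceptor:
    n_terminal: Domain
    dbd: Domain
    lbd: Domain

    def __post_init__(self):
        sources = {self.n_terminal.source, self.dbd.source, self.lbd.source}
        # Condition 1: at least one domain comes from the novel receptor.
        if NOVEL_ER not in sources:
            raise ValueError("at least one domain must come from the novel ER")
        # Condition 2: DBD and LBD originate from different proteins
        # (this also guarantees that a second receptor contributes a domain).
        if self.dbd.source == self.lbd.source:
            raise ValueError("DBD and LBD must originate from different proteins")


# Example: PR N-terminal domain and DBD combined with the novel ER LBD.
chimera = ChimericReceptor(
    n_terminal=Domain("N-terminal", "PR"),
    dbd=Domain("DBD", "PR"),
    lbd=Domain("LBD", NOVEL_ER),
)
print(chimera.dbd.source, chimera.lbd.source)  # PR novel ER
```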

In particular, the chimeric receptor according to the 
invention comprises the LBD according to the invention,
said LBD having an amino acid sequence which exhibits at
least 70% homology with the amino acid sequence shown in
SEQ ID NO: 4. In that case the N-terminal domain and DBD
should be derived from another nuclear receptor, such as
for example PR. In this way a chimeric receptor is
constructed which is activated by a ligand of the ER
according to the invention and which targets a gene under
control of a progesterone responsive element. The
chimeric receptors having a LBD according to the
invention are useful for the screening of compounds to
identify novel ligands or hormone analogs which are able
to activate an ER according to the invention.

In addition, chimeric receptors comprising a DBD 
according to the invention, said DBD having an amino acid 
sequence exhibiting at least 80% homology with the amino
acid sequence shown in SEQ ID NO: 3, and a LBD and,
optionally, an N-terminal domain derived from another
nuclear receptor, can be successfully used to identify
novel ligands or hormone analogs for said nuclear 
receptors. Such chimeric receptors are especially useful 
for the identification of the respective ligands of 
orphan receptors. 

Since steroid receptors have three domains with 
different functions, which are more or less independent, 
it is possible that all three functional domains have 
been derived from different members of the steroid 
receptor superfamily. 

Molecules which contain parts having a different 
origin are called chimeric. Such a chimeric receptor 
comprising the ligand-binding domain and/or the DNA-
binding domain of the invention may be produced by
chemical linkage, but most preferably the coupling is
accomplished at the DNA level with standard molecular
biological methods by fusing the nucleic acid sequences
encoding the necessary steroid receptor domains. Hence,
DNA encoding the chimeric receptor proteins according to
the invention is also subject of the present invention.

Such chimeric proteins can be prepared by transfecting
DNA encoding these chimeric receptor proteins to suitable
host cells and culturing these cells under suitable
conditions.

It is extremely practical if, next to the information 
for the expression of the steroid receptor, also the host 
cell is transformed or transfected with a vector which 
carries the information for a reporter molecule. Such a 
vector coding for a reporter molecule is characterized by 
having a promoter sequence containing one or more hormone 
responsive elements (HRE) functionally linked to an 
operative reporter gene. Such a HRE is the DNA target of 
the activated steroid receptor and, as a consequence, it 
enhances the transcription of the DNA coding for the 
reporter molecule. In in vivo settings of steroid 
receptors the reporter molecule comprises the cellular 
response to the stimulation of the ligand. However, it is 
possible in vitro to combine the ligand-binding domain of 
a receptor to the DNA binding domain and transcription 
activating domain of other steroid receptors, thereby 
enabling the use of other HRE and reporter molecule 
systems. One such system is established by a HRE
presented in the MMTV-LTR (mouse mammary tumor virus long
terminal repeat sequence) in connection with a reporter
molecule like the firefly luciferase gene or the
bacterial gene for CAT (chloramphenicol acetyltransferase).
Other HRE's which can be used are the rat oxytocin
promoter, the retinoic acid responsive element, the
thyroid hormone responsive element, the estrogen
responsive element and also synthetic responsive elements
have been described (for instance in Fuller, ibid, page
3096). As reporter molecules next to CAT and luciferase
β-galactosidase can be used.

Steroid receptors and chimeric receptors according to
the present invention can be used for the in vitro
identification of novel ligands or hormonal analogs. For
this purpose binding studies can be performed with cells
transformed with DNA according to the invention or an 
expression vector comprising DNA according to the 
invention, said cells expressing the steroid receptors or 
chimeric receptors according to the invention. 

The novel steroid receptor and chimeric receptors 
according to the invention as well as the ligand-binding 
domain of the invention, can be used in an assay for the 
identification of functional ligands or hormone analogs
for the nuclear receptors.

Thus, the present invention provides for a method for
identifying functional ligands for the steroid receptors
and chimeric receptors according to the invention, said 
method comprising the steps of 

a) introducing into a suitable host cell 1) DNA or 
an expression vector according to the 
invention, and 2) a suitable reporter gene 
functionally linked to an operative hormone 
response element, said HRE being able to be 
activated by the DNA-binding domain of the 
receptor protein encoded by said DNA; 

b) bringing the host cell from step a) into
contact with potential ligands which will
possibly bind to the ligand-binding domain of
the receptor protein encoded by said DNA from
step a);

c) monitoring the expression of the protein
encoded by said reporter gene of step a).

If expression of the reporter gene is induced with 
respect to basic expression (without ligand), the
functional ligand can be considered as an agonist; if
expression of the reporter gene remains unchanged or is 
reduced with respect to basic expression, the functional 
ligand can be a suitable (partial) antagonist. 
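
The decision rule above can be sketched in a few lines of Python. The
sketch is illustrative only: the function name, the 10% tolerance used
to decide whether a signal counts as "induced", and the example
readings are assumptions and are not values taken from this
specification.

```python
def classify_ligand(basal: float, with_ligand: float,
                    tolerance: float = 0.1) -> str:
    """Classify a ligand from reporter readings (arbitrary units).

    basal       : reporter signal without ligand (must be > 0)
    with_ligand : reporter signal after adding the ligand
    tolerance   : hypothetical margin; a fold change above 1 + tolerance
                  is treated as induction
    """
    if basal <= 0:
        raise ValueError("basal reporter signal must be positive")
    fold = with_ligand / basal
    if fold > 1.0 + tolerance:
        return "agonist"
    # Unchanged or reduced expression relative to basal.
    return "candidate (partial) antagonist"


print(classify_ligand(100.0, 350.0))  # agonist
print(classify_ligand(100.0, 100.0))  # candidate (partial) antagonist
print(classify_ligand(100.0, 40.0))   # candidate (partial) antagonist
```

In practice replicate wells and a statistical test would replace the
fixed tolerance used here.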

For performing such kind of investigations host cells 
which have been transformed or transfected with both a
vector encoding a functional steroid receptor and a
vector having the information for a hormone responsive
element and a connected reporter molecule are cultured in
a suitable medium. After addition of a suitable ligand,
which will activate the receptor, the production of the
reporter molecule will be enhanced, which production
simply can be determined by assays having a sensitivity
for the reporter molecule. See for instance WO-A-8803168.
Assays with known steroid receptors have been described
(for instance S. Tsai et al., Cell 57, 443, 1989; M.
Meyer et al., Cell 57, 433, 1989).



Legends to the figures 

Figure 1.

Northern analysis of the novel estrogen receptor. Two
different multiple tissue Northern blots (Clontech) were
hybridised with a specific probe for the novel estrogen
receptor (see examples). Indicated are the human tissues
the RNA originated from and the position of the size
markers in kilobases (kb).



Figure 2.

Histogram showing the 3- to 4-fold stimulatory effect
of ethinyl-estradiol, estriol and estrone on the
luciferase activity mediated by the novel estrogen
receptor. An expression vector encoding the novel
estrogen receptor was transiently transfected into CHO
cells together with a reporter construct containing the
rat oxytocin promoter in front of the firefly luciferase
encoding sequence (see examples).



Examples 

A. Molecular cloning of the novel estrogen receptor.

Two degenerate oligonucleotides containing inosines 
(I) were based on conserved regions of the DNA-binding 
domains and the ligand-binding domains of the human 
steroid hormone receptors. 

Primer #1:
5'-GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)TA(C/T)GG-3'
(SEQ ID NO: 7).

Primer #2:
5'-AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT-3'
(SEQ ID NO: 8).
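
To make the primer notation concrete, the Python sketch below parses
the bracketed alternatives, for example (C/T), and the inosine symbol
I, and reports the primer length, the number of distinct sequences in
the degenerate pool (inosine counted as a single base) and the number
of inosines. The helper names are hypothetical, and the two primer
strings are the two degenerate primers listed above as transcribed
here, with spaces removed.

```python
import re
from math import prod

# One token is either a bracketed set such as (C/T) or a single base/inosine.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")


def parse_primer(notation: str):
    """Return one tuple of allowed symbols per primer position."""
    compact = notation.replace(" ", "")
    positions, pos = [], 0
    for match in TOKEN.finditer(compact):
        if match.start() != pos:  # a character was skipped: invalid input
            raise ValueError(f"unexpected character at index {pos}")
        pos = match.end()
        if match.group(1):
            positions.append(tuple(match.group(1).split("/")))
        else:
            positions.append((match.group(2),))
    if pos != len(compact):
        raise ValueError(f"unexpected character at index {pos}")
    return positions


def degeneracy(positions) -> int:
    """Number of distinct oligonucleotides in the degenerate mixture."""
    return prod(len(p) for p in positions)


def inosine_count(positions) -> int:
    return sum(1 for p in positions if p == ("I",))


PRIMER_1 = "GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)TA(C/T)GG"
PRIMER_2 = "AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT"

for name, primer in (("primer 1", PRIMER_1), ("primer 2", PRIMER_2)):
    p = parse_primer(primer)
    print(name, len(p), degeneracy(p), inosine_count(p))
# primer 1 32 128 3
# primer 2 29 32 4
```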

As template, cDNA from human EBV-stimulated PBLs
(peripheral blood leukocytes) was used. One microgram of
total RNA was reverse transcribed in a 20 µl reaction
containing 50 mM KCl, 10 mM Tris-HCl pH 9.3, 4 mM MgCl2,
1 mM dNTPs (Pharmacia), 100 pmol random hexanucleotides
(Pharmacia), 30 Units RNase inhibitor (Pharmacia) and 200
Units M-MLV Reverse transcriptase (Gibco BRL). Reaction
mixtures were incubated at 37°C for 30 minutes and heat-
inactivated at 100°C for 5 minutes. The cDNA obtained was
used in a 100 µl PCR reaction containing 10 mM Tris-HCl
pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin (w/v), 3%
DMSO, 1 microgram of primer #1 and primer #2 and 2.5
Units of Amplitaq DNA polymerase (Perkin Elmer). PCR
reactions were performed in the Perkin Elmer 9600 thermal
cycler. The initial denaturation (4 minutes at 94°C) was
followed by 36 cycles with the following conditions: 30
sec 94°C, 30 sec 45°C, 1 minute 72°C and after 7 minutes
at 72°C the reactions were stored at 4°C. Aliquots of
these reactions were analysed on a 1.5% agarose gel.
Fragments of interest were cut out of the gel,
reamplified using identical PCR-conditions and purified
using Qiaex II (Qiagen). Fragments were cloned in the
pCRII vector and transformed into bacteria using the TA-
cloning kit (Invitrogen). Plasmid DNA was isolated for
nucleotide sequence analysis using the Qiagen plasmid
midi protocol (Qiagen). Nucleotide sequence analysis was
performed with the ALF automatic sequencer (Pharmacia)
using a T7 DNA sequencing kit (Pharmacia) with vector-
specific or fragment-specific primers.
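
For orientation, the cycling programme described above can be written
down as data and its total programmed hold time computed. The Python
sketch below is illustrative: the class and variable names are
hypothetical, and ramp times between temperatures, which depend on the
instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # programmed hold time


INITIAL = Step("initial denaturation", 94, 4 * 60)
CYCLE = [
    Step("denaturation", 94, 30),
    Step("annealing", 45, 30),
    Step("extension", 72, 60),
]
FINAL = Step("final extension", 72, 7 * 60)
N_CYCLES = 36


def hold_time_seconds(initial: Step, cycle, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding ramping and the 4°C storage."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total = hold_time_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(total, total / 60)  # 4980 83.0
```

The later reactions that use a 60°C annealing temperature could be
modelled by replacing the annealing step in `CYCLE`.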

One cloned fragment corresponded to a novel estrogen 
receptor (ER> which is closely related to the classical 
estrogen receptor. Part of the cloned novel estrogen 
receptor fragment (nucleotides 466 to 797 in SSQ ID 1) 
was amplified by PCR using oligonucleotide #3 
TGTTACGAAGTGGGAATGGTGA (SEQ ID NO: 9) and oligonucleotide 
#2 and used as a probe to screen a human testis cDNA 
library in \gtll (Clontech #HL1010b) . Recombinant phages 
were plated (using Y1090 bacteria grown in LB medium 
supplemented with 0,2% maltose) at a density of 40.000 



per 135 mm dish and replica filters (Hybond-N, Amersham) 
were made as described by the supplier. Filters were 
prehybridised in a solution containing 0.5 M pho$phate 
buffer CpH 7.5) and 7% SDS at 65°C for at least 30 
minutes. DNA probes were purified with Qiaex II (Qiagen) , 
32P- labeled with a Decaprine kit (Ambion) and added to 
the prehybridisation solution. Filrers were hybridised at 
65°C overnight and then washed in 0.5 X SSC/0.1% SDS at 
65°C Two positive plaques were identified and could be 
shown to be identical. These clones were purified by 
rescreening one more tirae. A PCR reaction on the phage 
eluates with the λgt11-specific primers #4: 5'-
TTGACACCAGACCAACTGGTAATG-3' (SEQ ID NO: 10) and #5: 5'-
GGTGGCGACGACTCCTGGAGCCCG-3' (SEQ ID NO: 11) yielded a
fragment of 1700 base pairs on both clones. Subsequent PCR
reactions using combinations of gene-specific primer
#6: 5'-GTACACTGATTTGTAGCTGGAC-3' (SEQ ID NO: 12) with the
λgt11 primer #4 and gene-specific primer #7: 5'-
CCATGATGATGTCCCTGACC-3' (SEQ ID NO: 13) with λgt11
primer #5 yielded fragments of 450 bp and 1000 bp,
respectively, which were cloned in the PCRII vector and
used for nucleotide sequence analysis. The conditions for
these PCR reactions were as described above except for
the primer concentrations (200 ng of each primer) and the
annealing temperature (60°C). Since in the cDNA clone the
homology with the ER is lost abruptly at a site which
corresponds to the exon 7/exon 8 boundary in the ER, it
was suggested that the sequence downstream of this site
corresponds to intron 7 of the novel ER gene. For
verification of the nucleotide sequences of this cDNA
clone, a 1200 bp fragment was generated on the cDNA clone
with λgt11 primer #4 and gene-specific primer #8,
corresponding to the 3' end of exon 7:
5'-TCGCATGCCTGACGTGGGAC-3' (SEQ ID NO: 14), using
the proofreading Pfu polymerase (Stratagene). This
fragment was also cloned in the PCRII vector and
completely sequenced and was shown to be identical to the
sequences obtained earlier.
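
The primer positions quoted above can be checked against
the sequence listing. The following is a minimal sketch in
Python, assuming that nucleotides 361 to 780 of SEQ ID NO: 1
read as transcribed below; the names ROWS, SEGMENT and
locate are illustrative only and are not part of the
original disclosure.

```python
# Nucleotides 361-780 of SEQ ID NO: 1 as transcribed from the sequence
# listing (seven rows of 60 nucleotides; blanks are removed below).
ROWS = [
    "AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC",
    "GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA",
    "ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA",
    "AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC",
    "CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG",
    "GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC",
    "ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC",
]
SEGMENT_START = 361  # 1-based position of the first nucleotide of SEGMENT
SEGMENT = "".join(ROWS).replace(" ", "")

PRIMERS = {
    "#3": "TGTTACGAAGTGGGAATGGTGA",  # SEQ ID NO: 9
    "#6": "GTACACTGATTTGTAGCTGGAC",  # SEQ ID NO: 12
    "#7": "CCATGATGATGTCCCTGACC",    # SEQ ID NO: 13
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer):
    """Return (first, last, strand) in SEQ ID NO: 1 numbering, or None.

    Strand '+' means the primer reads like the listed strand; '-' means
    it anneals to the listed strand (i.e. it is an antisense primer).
    """
    for strand, query in (("+", primer), ("-", reverse_complement(primer))):
        idx = SEGMENT.find(query)
        if idx >= 0:
            first = SEGMENT_START + idx
            return first, first + len(primer) - 1, strand
    return None


for name, seq in PRIMERS.items():
    print(name, locate(seq))
```

With the segment as transcribed, the sketch places
oligonucleotide #3 at nucleotides 466 to 487, which agrees
with the start of the probe fragment quoted above.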

To obtain nucleotide sequences of the novel ER
downstream of exon 7, a degenerate oligonucleotide based
on the AF-2 region of the classic ER (#9: 5'-
GGC(C/G)TCCAGCATCTCCAG(C/G)A(A/G)CAG-3'; SEQ ID NO: 15)
was used together with the gene-specific oligonucleotide
#10: 5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO: 16) using
testis cDNA as template (Marathon ready testis cDNA,
Clontech Cat #7414-1). A specific 220 bp fragment
corresponding to nucleotides 1112 to 1332 in SEQ ID NO: 1
was cloned and sequenced and was shown to contain high
homology with the corresponding region in the classic ER.
In order to obtain sequences of the novel ER downstream
of the AF-2 region, RACE (rapid amplification of cDNA
ends) PCR reactions were performed using the Marathon-
ready testis cDNA (Clontech) as template. The initial PCR
was performed using oligonucleotide #11: 5'-
TCTTGTTCTGGACAGGGATG-3' (SEQ ID NO: 17) in combination
with the AP1 primer provided in the kit. A nested PCR was
performed on an aliquot of this reaction using
oligonucleotide #10 in combination with the oligo dT
primer provided in the kit. Subsequently, an aliquot of
this reaction was used in a nested PCR using
oligonucleotide #12: 5'-GCATGGAACATCTGCTCAAC-3' (SEQ ID
NO: 18) in combination with the oligo dT primer.
Nucleotide sequence analysis of a specific fragment that
was obtained (corresponding to nucleotides 1256 to 1431
in SEQ ID NO: 1) revealed a sequence encoding the
carboxyterminus of the novel ER ligand-binding domain,
including an F-domain and a translational stop codon.
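
The degenerate oligonucleotide #9 stands for a small pool
of defined sequences. The following is a minimal sketch in
Python that enumerates this pool, assuming the standard
IUPAC ambiguity codes (S = C or G, R = A or G) used for
SEQ ID NO: 15 in the sequence listing; the names IUPAC,
OLIGO_9 and expand are illustrative only.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for oligonucleotide #9.
# Unambiguous bases map to themselves.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG", "R": "AG"}

# SEQ ID NO: 15, i.e. GGC(C/G)TCCAGCATCTCCAG(C/G)A(A/G)CAG
OLIGO_9 = "GGCSTCCAGCATCTCCAGSARCAG"


def expand(degenerate):
    """List every unambiguous sequence represented by a degenerate oligo."""
    choices = [IUPAC[base] for base in degenerate]
    return ["".join(combo) for combo in product(*choices)]


variants = expand(OLIGO_9)
print(len(variants), "sequences in the pool")
for variant in variants:
    print(variant)
```

Three two-fold degenerate positions give a pool of
2 x 2 x 2 = 8 oligonucleotides of 24 bases each.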



B. Identification of two splice variants of the novel
estrogen receptor.

Rescreening of the testis cDNA library with a probe
corresponding to nucleotides 917 to 1248 in SEQ ID NO: 1
yielded two hybridizing clones, the 3' ends of which were
amplified by PCR (gene-specific primer #8: 5'-
GGAAGCTGGCTCACTTGCTG-3' together with primer #4), cloned
and sequenced. One clone was shown to contain an
alternative exon 8 (exon 8B) of the novel ER. As a
consequence of the introduction of this exon through a
specific alternative splicing reaction, the reading frame
encoding the novel ER is immediately terminated, thereby
creating a truncation of the carboxyterminus of the novel
ER.

Screening of a human thymus cDNA library (Clontech
HL1074a) with the probe corresponding to nucleotides 935
to 1266 in SEQ ID NO: 1 revealed another splice variant.
The 3' end of one hybridizing clone was amplified using
primer #8 with the λgt10-specific primer #13: 5'-
AGCAAGTTCAGCCTGTTAAGT-3' (SEQ ID NO: 19), cloned in the
PCRII vector and sequenced. The obtained nucleotide
sequences upstream of the exon 7/exon 8 boundary were
identical to the clones identified earlier. However, an
alternative exon 8 (exon 8C) was present at the 3' end,
encoding two C-terminal amino acids followed by a stop
codon.
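
The effect of the alternative exons on the reading frame
can be illustrated by translating the 3' ends of the three
cDNA sequences. The following is a minimal sketch in
Python using the standard genetic code. It assumes that the
three sequences below are the tails of SEQ ID NO: 1,
SEQ ID NO: 2 and SEQ ID NO: 20 from nucleotide 1231 onwards,
and that SEQ ID NO: 2 and SEQ ID NO: 20 correspond to the
exon 8B and exon 8C variants, respectively; the names
CODON_TABLE, translate and TAILS are illustrative only.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate in frame 1, stopping after the first stop codon ('*')."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(amino_acid)
        if amino_acid == "*":
            break
    return "".join(protein)


# Tails starting at nucleotide 1231 (in frame, since 1230 is a multiple of 3).
TAILS = {
    "SEQ ID NO: 1": "CACGTCAGGCATGCGAGTAACAAGGGCATG",
    "SEQ ID NO: 2": "CACGTCAGGCATGCGAGGTGA",
    "SEQ ID NO: 20": "CACGTCAGGCATGCGAGGTCTGCCTGA",
}

for name, tail in TAILS.items():
    print(name, translate(tail))
```

Under these assumptions the first tail continues into the
ligand-binding domain, whereas the second stops directly
after the codon spanning the exon boundary and the third
adds two further amino acids (Ser, Ala) before the stop
codon.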

These two variants of the novel estrogen receptor do
not contain the AF-2 region and therefore probably lack
the ability to modulate transcription of target genes in
a ligand-dependent fashion. However, the variants
could potentially interfere with the functioning of the
wild-type classic ER and/or the wild-type novel ER,
either by heterodimerization or by occupying estrogen
response elements. A mutant of the classic ER (ER1-530)
has been described which closely resembles the two
variants of the novel estrogen receptor described above.
ER1-530 has been shown to behave as a dominant-negative
receptor, i.e. it can block the intracellular activity of
the wild-type ER (Ince et al, J. Biol. Chem. 268,
14026-14032, 1993).



C. Northern blot analysis.

Human multiple tissue Northern blots (MTN-blots) were
purchased from Clontech and prehybridized for at least 1
hour at 65°C in 0.5 M phosphate buffer pH 7.5 with 7%
SDS. DNA fragments that were used as probes were
32P-labeled using a labelling kit (Ambion), denatured by
boiling and added to the prehybridisation solution.
Washing conditions were: 3 x SSC at room temperature,
followed by 3 x SSC at 65°C, and finally 1 x SSC at 65°C.
The filters were then exposed to X-ray films for one
week. Two transcripts of approximately 8 kb and 10 kb
were detected in thymus, spleen, ovary and testis. In
addition, a 1.3 kb transcript was detected in testis.

D. Ligand-dependent transcription activation by the
novel estrogen receptor protein.

Cell culture 

Chinese Hamster Ovary (CHO K1) cells were obtained
from ATCC (CCL61) and maintained at 37°C in a humidified
atmosphere (5% CO2) as a monolayer culture in phenol
red-free M505 medium. The latter medium consists of a
mixture (1:1) of Dulbecco's Modified Eagle's Medium (DMEM,
Gibco 074-200) and Nutrient Medium F12 (Ham's F12, Gibco
074-1700) supplemented with 2.5 mg/ml sodium carbonate
(Baker), 55 µg/ml sodium pyruvate (Fluka), 2.3 ng/ml
β-mercaptoethanol (Baker), 1.2 µg/ml ethanolamine (Baker),
360 ng/ml L-glutamine (Merck), 0.45 mg/ml sodium selenite
(Fluka), 62.5 ng/ml penicillin (Mycopharm), 62.5 ng/ml
streptomycin (Serva), and 5% charcoal-treated bovine calf
serum (Hyclone).

Recombinant vectors 

The novel ER encoding sequence as presented in SEQ ID
NO: 1 was amplified by PCR using oligonucleotides 5'-
CTTGGATCCATAGCCCTGCTGTGATGAATTACAG-3' (SEQ ID NO: 22)
(underlined is the translation initiation codon) and
5'-GATGGATCCTCACCTCAGGGCCAGGCGTCACTG-3' (SEQ ID NO: 23)
(underlined is the translation stop codon, antisense). The
resulting BamHI fragment (approximately 1450 base pairs)
was then cloned in the mammalian cell expression vector
pNGV1 behind the SV40 early promoter. In addition, this
vector contains the IgG and MuLV enhancers.
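
The design of the two cloning oligonucleotides can be
checked with a few string operations. The following is a
minimal sketch in Python, assuming that the coding sequence
of SEQ ID NO: 1 starts with ATGAATTACAG and ends with the
codon for the last amino acid followed by a TGA stop codon
(CAGTGA); the names CDS_START, CDS_END and check_primers
are illustrative only.

```python
BAMHI_SITE = "GGATCC"  # recognition sequence of BamHI

FORWARD = "CTTGGATCCATAGCCCTGCTGTGATGAATTACAG"  # SEQ ID NO: 22
REVERSE = "GATGGATCCTCACCTCAGGGCCAGGCGTCACTG"   # SEQ ID NO: 23 (antisense)

CDS_START = "ATGAATTACAG"  # first 11 nucleotides of SEQ ID NO: 1
CDS_END = "CAGTGA"         # assumed last 6 nucleotides (Gln codon + stop)


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_primers():
    """Report whether each primer has the expected design features."""
    sense_of_reverse = reverse_complement(REVERSE)
    return {
        "forward primer contains a BamHI site": BAMHI_SITE in FORWARD,
        "reverse primer contains a BamHI site": BAMHI_SITE in REVERSE,
        "forward primer ends on the start of the coding sequence":
            FORWARD.endswith(CDS_START),
        "reverse primer anneals over the stop codon":
            sense_of_reverse.startswith(CDS_END),
    }


for feature, present in check_primers().items():
    print(feature, "->", present)
```

Both oligonucleotides carry a BamHI site near their 5' end,
which is consistent with the cloning of the PCR product as
a BamHI fragment.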

The reporter expression vector was based on the rat
oxytocin gene regulatory region (position -363/+16 as a
HindIII/MboI fragment; R. Ivell and D. Richter,
Proc. Natl. Acad. Sci. USA 81, 2006-2010, 1984) linked to the
firefly luciferase encoding sequence; the regulatory
region of the oxytocin gene was shown to possess
functional estrogen hormone response elements in vitro
for both the rat (R. Adan, N. Walther, J. Cox, R. Ivell and
P. Burbach, Biochem. Biophys. Res. Comm. 175, 117-122, 1991)
and the human (S. Richard and H. Zingg, J. Biol. Chem. 265,
6098-6103, 1990).

Transient transfection



1 x 10 a CHO cells were seeded in 6-well Nunclon tissue
culture plates and DNA was introduced by use of
lipofectin (Gibco BRL). To this end, the DNA (1 µg of both
receptor and reporter vector in 250 µL Optimem, Gibco
BRL) was mixed with an equal volume of lipofectin reagent
(7 µL in 250 µL Optimem, Gibco) and allowed to stand at
room temperature for 15 min. After washing the cells
twice with serum-free medium (M505), new medium (500 µL
Optimem, Gibco) was added to the cells followed by the
dropwise addition of the DNA-lipofectin mixture. After
incubation for 5 hours at 37°C, the cells were washed
twice with phenol red-free M505 + 5% charcoal-treated
bovine calf serum and incubated overnight at 37°C. After
24 hours hormone (ethinyl-estradiol, estriol or estrone)
was added to the medium (100 nmol/L). Cell extracts were
made 48 hours post-transfection by the addition of 200 µL
lysis buffer (0.1 M phosphate buffer pH 7.8, 0.2% Triton
X-100). After incubation for 5 min at 37°C the cell
suspension was centrifuged (Eppendorf centrifuge, 5 min)
and 20 µL sample was added to 50 µL luciferase assay
reagent (Promega). Light emission was measured in a
luminometer (Berthold Biolumat) for 10 sec at 562 nm.

Results.

CHO cells transiently transfected with the novel ER
expression vector and a reporter plasmid showed a 3 to 4
fold increase in luciferase activity in response to
ethinyl-estradiol as compared to untreated cells. A
similar transactivation was obtained upon treatment with
estriol and estrone. The results indicate not only that
the novel ER can bind estrogen hormones but also that the
ligand-activated receptor can bind to the ERE within the
rat oxytocin promoter and activate transcription of the
luciferase reporter gene.
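
The fold increase referred to above is the ratio between
the luciferase signal of hormone-treated cells and that of
untreated cells. The following is a minimal sketch in
Python of this calculation. The readings are hypothetical
values chosen purely for illustration and are not
experimental data from this work; the name fold_induction
is likewise illustrative only.

```python
from statistics import mean


def fold_induction(treated, untreated):
    """Ratio of the mean treated signal to the mean untreated signal."""
    baseline = mean(untreated)
    if baseline <= 0:
        raise ValueError("untreated signal must be positive")
    return mean(treated) / baseline


# Hypothetical luminometer readings (relative light units), for
# illustration only; these are NOT measured values.
untreated = [1000.0, 1100.0, 900.0]
ethinyl_estradiol = [3400.0, 3600.0, 3500.0]

print(fold_induction(ethinyl_estradiol, untreated))
```

Averaging replicate wells before taking the ratio is one
common choice; normalisation to a co-transfected control
reporter would be an alternative that is not described in
the text.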
(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Akzo Nobel N.V.
(B) STREET: Velperweg 76
(C) CITY: Arnhem
(E) COUNTRY: The Netherlands
(F) POSTAL CODE (ZIP): 6824 BM
(G) TELEPHONE: 0412-666379
(H) TELEFAX: 0412-650592
(I) TELEX: 37503 akpha nl

(ii) TITLE OF INVENTION: Novel estrogen receptor
(iii) NUMBER OF SEQUENCES: 23

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1434 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC
60

ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC
120

CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA
180

GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC
240

GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC
300

GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT
360

AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC
420

GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA
480

ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA
540

AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC
600

CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG
660

GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC
720

ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC
780

AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC
840

TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG
900

CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT
960

CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA
1020

CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG
1080

GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG
1140

ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG
1200

CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGTAA CAAGGGCATG
1260

GAACATCTGC TCAACATGAA GTGCAAAAAT GTGGTCCCAG TGTATGACCT GCTGCTGGAG
1320

ATGCTGAATG CCCACGTGCT TCGCGGGTGC AAGTCCTCCA TCACGGGGTC CGAGTGCAGC
1380

CCGGCAGAGG ACAGTAAAAG CAAAGAGGGC TCCCAGAACC CACAGTCTCA GTGA
1434

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC
60

ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC
120

CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA
180

GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC
240

GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC
300

GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT
360

AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC
420

GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA
480

ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA
540

AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC
600

CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG
660

GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC
720

ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC
780

AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC
840

TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG
900

CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT
960

CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA
1020

CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG
1080

GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG
1140

ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG
1200

CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTG A
1251

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp
1 5 10 15

Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His
20 25 30

Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn
35 40 45

Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val
50 55 60

Gly Met
65

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 233 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser
1 5 10 15

Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr
20 25 30

Lys Leu Ala Asp Lys Glu Leu Val His Met Ile Ser Trp Ala Lys Lys
35 40 45

Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu
50 55 60

Glu Ser Cys Trp Met Glu Val Leu Met Met Gly Leu Met Trp Arg Ser
65 70 75 80

Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp
85 90 95

Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met
100 105 110

Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys
115 120 125

Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr
130 135 140

Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala
145 150 155 160

His Leu Leu Asn Ala Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys
165 170 175

Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu
180 185 190

Met Leu Leu Ser His Val Arg His Ala Ser Asn Lys Gly Met Glu His
195 200 205

Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu
210 215 220

Leu Glu Met Leu Asn Ala His Val Leu
225 230

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 477 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser
405 410 415

Asn Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val
420 425 430

Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg
435 440 445

Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp
450 455 460

Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
465 470 475

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 416 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
405 410 415

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGIGAYGAKG CWTCIGGITG YCAYTAYGG
29

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

AAGCCTGGSA YICKYTTIGC CCAIYTIAT
29

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TGTTACGAAG TGGGAATGGT GA
22

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTGACACCAG ACCAACTGGT AATG
24

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGTGGCGACG ACTCCTGGAG CCCG
24

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GTACACTGAT TTGTAGCTGG AC
22

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCATGATGAT GTCCCTGACC
20

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCGCATGCCT GACGTGGGAC
20

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGCSTCCAGC ATCTCCAGSA RCAG
24

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GGAAGCTGGC TCACTTGCTG
20

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCTTGTTCTG GACAGGGATG
20

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

GCATGGAACA TCTGCTCAAC
20

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AGCAAGTTCA GCCTGTTAAG T
21

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1257 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC
60

ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC
120

CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA
180

GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC
240

GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC
300

GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT
360

AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC
420

GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA
480

ATGGTGAAGT GTGGCTCCCG GAGAGAGAGA TGTGGGTACC GCCTTGTGCG GAGACAGAGA
540

AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC
600

CGAGTGCGGG AGCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTAGTGCT CACCCTCCTG
660

GAGGCTGAGC CGCCCCATGT GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCGTCC
720

ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC
780

AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC
840

TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG
900

CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT
960

CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA
1020

CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCAGTAT GTACCCTCTG
1080

GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG
1140

ACCGATGCTT TGGTTTGGGT GATTGCCAAG AGCGGCATCT CCTCCCAGCA GCAATCCATG
1200

CGCCTGGCTA ACCTCCTGAT GCTCCTGTCC CACGTCAGGC ATGCGAGGTC TGCCTGA
1257

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 418 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Arg
405 410 415

Ser Ala

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CTTGGATCCA TAGCCCTGCT GTGATGAATT ACAG
34

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GATGGATCCT CACCTCAGGG CCAGGCGTCA CTG
33

Claims:

1. Isolated DNA encoding a protein having an N-terminal
domain, a DNA-binding domain and a ligand-binding
domain, wherein the amino acid sequence of said
DNA-binding domain of said protein exhibits at least 80%
homology with the amino acid sequence shown in SEQ
ID NO: 3, and the amino acid sequence of said
ligand-binding domain of said protein exhibits at least 70%
homology with the amino acid sequence shown in SEQ
ID NO: 4.

2. Isolated DNA according to claim 1, characterized in
that the amino acid sequence of said DNA-binding
domain of said protein exhibits at least 90%,
preferably 95%, more preferably 98%, most preferably
100% homology with the amino acid sequence shown in
SEQ ID NO: 3.

3. Isolated DNA according to claim 1 or 2,
characterized in that the amino acid sequence of
said ligand-binding domain of said protein exhibits
at least 75%, preferably 80%, more preferably 90%,
most preferably 100% homology with the amino acid
sequence shown in SEQ ID NO: 4.

4. Isolated DNA according to claims 1 to 3, said DNA
encoding a protein comprising the amino acid
sequence of SEQ ID NO: 5, SEQ ID NO: 6 or SEQ ID
NO: 21.

5. Isolated DNA according to claims 1 to 4,
characterized in that said DNA comprises the nucleic
acid sequence of SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID
NO: 20.

6. A recombinant expression vector comprising the DNA
according to any of the claims 1 to 5.

7. A cell transfected with DNA according to claims 1 to
5 or an expression vector according to claim 6.

8. A cell according to claim 7 which is a stable
transfected cell line which expresses the steroid
receptor protein according to any of the claims 9 to
11.

9. Protein encoded by DNA according to claims 1 to 5 or
an expression vector according to claim 6.

10. Protein according to claim 9, said protein
comprising the amino acid sequence of SEQ ID NO: 5,
SEQ ID NO: 6 or SEQ ID NO: 21.

11. Chimeric protein having an N-terminal domain, a
DNA-binding domain, and a ligand-binding domain,
characterized in that at least one of said domains
of said chimeric protein originates from a protein
according to claims 9 or 10, and at least one of the
other domains of said chimeric protein originates
from another receptor protein from the nuclear
receptor superfamily, provided that the DNA-binding
domain and the ligand-binding domain of said
chimeric protein originate from different proteins.

12. DNA encoding a protein according to claim

13. Use of a DNA according to claims 1 to 5 or 12, an
expression vector according to claim 6, a cell
according to claim 7 or 8 or a protein according to
claims 9 to 11 in a screening assay for
identification of new drugs.

14. A method for identifying functional ligands for the
protein according to claims 9 to 11, said method
comprising the steps of

a) introducing into a suitable host cell 1) DNA
according to claims 1 to 5 or 12, and 2) a
suitable reporter gene functionally linked to
an operative hormone response element, said HRE
being able to be activated by the DNA-binding
domain of the protein encoded by said DNA;

b) bringing the host cell from step a) into
contact with potential ligands which will
possibly bind to the ligand-binding domain of
the protein encoded by said DNA from step a);

c) monitoring the expression of the protein
encoded by said reporter gene of step a).
ABSTRACT

The present invention relates to isolated DNA encoding
novel estrogen receptors, the proteins encoded by said
DNA, chimeric receptors comprising parts of said novel
receptors and uses thereof.